PDSP16488A MA
Single Chip 2D Convolver with Integral Line Delays

Supersedes January 1997 version, DS3742 - 3.1
DS3742 - 5.0 November 2000

The PDSP16488A is a fully integrated, application specific, image processing device. It performs a two dimensional convolution between the pixels within a video window and a set of stored coefficients. An internal multiplier accumulator array can be multi-cycled at double or quadruple the pixel clock rate. This gives the window size options listed in Table 1.

An internal 32k bit RAM can be configured to provide either four or eight line delays. The length of each delay can be programmed to the user's requirement, up to a maximum of 1024 pixels per line. The line delays are arranged in two groups, which may be internally connected in series or may be configured to accept separate pixel inputs. This allows interlaced video or frame to frame operations to be supported.

The 8 bit coefficients are also stored internally and can be downloaded from a host computer or from an EPROM. No additional logic is required to support the EPROM, and a single EPROM can support up to 16 convolvers.

The PDSP16488A contains an expansion adder and delay network which allows several devices to be cascaded. Convolvers with larger windows can then be fabricated as shown in Table 2.

Intermediate 32 bit precision is provided to avoid any danger of overflow, but the final result will not normally occupy all bits. The PDSP16488A thus provides a multiplier in the output path, which allows the user to align the result to the most significant end of the 32 bit word.

FEATURES
  The PDSP16488A is a fully compatible replacement for the PDSP16488
  8 or 16 bit pixels with rates up to 40 MHz
  Window sizes up to 8 x 8 with a single device
  Eight internal line delays
  Supports interlace and frame to frame operations
  Coefficients supplied from an EPROM or remote host
  Expandable in both X and Y for larger windows
  Gain control and pixel output manipulation
  132 pin QFP

Table 1. Single device configurations

  Data size   Window size       Max pixel   Line
  (bits)      (width x depth)   rate        delays
  8           4 x 4             40 MHz      4 x 1024
  8           8 x 4             20 MHz      4 x 1024
  8           8 x 8             10 MHz      8 x 512
  16          4 x 4             20 MHz      4 x 512
  16          8 x 4             10 MHz      4 x 512

Table 2. Devices needed to implement typical window sizes

  Max pixel rate    10 MHz       20 MHz       40 MHz
  Pixel size         8    16      8    16      8    16
  Window size
  3 x 3              1     1      1     1      1     2
  5 x 5              1     2      2     4      4     -
  7 x 7              1     2      2     4      4     -
  9 x 9              4     -      6     -      -     -
  11 x 11            4     -      6     -      -     -
  15 x 15            4     -      8     -      -     -
  23 x 23            9     -      -     -      -     -

  * Maximum rate is limited to 30 MHz by line store expansion delays.

[Fig. 1. Typical, stand alone, real time system: composite data in, sync extract, pixel clock generator, A/D converter, optional field store, PDSP16488A convolver, EPROM, power on reset, output data.]

  Rev    A          B          C          D
  Date   Mar 1993   Jul 1996   Jan 1997

NOTE
Polyimide is used as an inter-layer dielectric and as glassivation. Polymeric material is also used for die attach, which according to the requirement in paragraph 1.2.1.b.(2) precludes categorising this device as fully compliant. In every other respect this device has been manufactured and screened in full accordance with the requirements of MIL-STD 883 (latest revision).

CHANGE NOTIFICATION
The change notification requirements of MIL-PRF-38535 will be implemented on this device type. Known customers will be notified of any changes since the last buy when ordering further parts, if significant changes have been made.
[Fig. 2. Functional block diagram: line delays (4 + 3 + 1), bypass mux, 8 x 8 array of MACs, coefficient store (64), X delay, Y delay, adder, scaler, comparator, control registers, multi purpose data bus.]

Pin out table (84 pin PGA - AC84)

  Function  Pin    Function  Pin    Function  Pin    Function  Pin
  L0        A1     X15       M3     RES       K12    D7        B9
  F1        B1     X14       N3     CS0       K13    D8        A9
  L1        C2     X13       M4     CS1       J12    CLK       B8
  L2        C1     spare     N4     CS2       J13    spare     B7
  L3        D2     SINGLE    M5     CS3       H12    D9        A7
  spare     D1     X12       N5     PROG      G12    D10       B6
  L4        E2     X11       M6     DS        G13    D11       A5
  L5        E1     MASTER    M7     CE        F12    spare     B5
  L6        F2     X10       N7     R/W       E13    D12       A4
  L7        G2     X9        M8     HRES      E12    D13       B4
  IP7       G1     X8        N9     OV        D13    D14       A3
  spare     H2     X7        M9     PC1       D12    D15       B3
  IP6       J1     X6        N10    BIN       C13    F0        A2
  IP5       J2     X5        M10    OEN       C12    VDD       F1
  IP4       K1     X4        N11    D0        B13    VDD       N6
  spare     K2     X3        M11    D1        A13    VDD       F13
  IP3       L1     X2        N12    D2        A12    VDD       A6
  IP2       L2     X1        N13    D3        B11    GND       H1
  IP1       M1     X0        M13    D4        A11    GND       N8
  IP0       N1     DELOP     L12    D5        B10    GND       H13
  BYPASS    N2     PC0       L13    D6        A10    GND       A8
NAME (TYPE) / DESCRIPTION

IP7:0 (Input)
  Pixel data input to the first line delay. [Most significant byte in 16 bit mode.]

L7:0 (I/O)
  Pixel data input to the second group of line delays. [Least significant byte in 16 bit mode.] Alternatively an output from the last line delay when the appropriate mode bit is set.

BYPASS (Input)
  The first line delay in the first group is bypassed when this input is active (high). No internal pull up.

HRES (Input)
  Resets the line delay address pointers when high. Normally the composite sync signal in real time applications. In non real time systems it defines a frame store update period, when low.

X15:0 (Dual function)
  Address/data connections from a master or single device to the external coefficient source, with X15 defining EPROM or host support. Otherwise they provide the expansion data input.

D15:0 (Output)
  Signed 16 bit scaled data or multiplexed 32 bit intermediate data. During intermediate transfers the most significant half is valid when the clock is low, and the least significant half when the clock is high.

PC1 (Output)
  During programming a master device outputs a timing strobe on this pin. This is passed down the chain in a multiple device system, using the PC0 input on the next device.

PC0 (Input)
  This pin is used in conjunction with PC1 in multiple device systems. It terminates the write strobe from a master device which is EPROM supported.

DELOP (Output)
  This output provides a version of the HRES input which has been delayed by an amount defined by the user.

DS (I/O)
  The data strobe from a host computer. Active low. This pin will be an output from an EPROM supported master device, which provides strobes to the remaining devices.

CE (Input)
  An active low enable which is internally gated with R/W and DS to perform reads or writes to the internal registers. In a single or master device which is supported from an EPROM, the bottom 72 addresses are always used and CE is not needed. CE can then be used to initiate a new register load sequence after the power on load sequence.

R/W (Input)
  Read / not write line from the host CPU. When an EPROM is used this pin should be tied low.

PROG (I/O)
  This pin is normally an input which signifies that registers are to be changed or examined. It is, however, an output from an EPROM supported single or master device, indicating to the rest of the system that registers are being updated.

CLK (Input)
  Clock. All events are triggered on the rising edge of the clock, except the latching of the least significant expansion inputs. Internally the clock can be multiplied by two or four in order to increase the effective number of multipliers.

BIN (Output)
  This output indicates the result from the internal comparison. A high value indicates that the pixel was greater than the internal threshold. The output is only valid from the last device in a chain.

OV (Output)
  When high this output indicates that there has been a gain control overflow.

RES (Input)
  Active low power on reset signal.

SINGLE (Input)
  Tied to ground to indicate a single device system. Internal pull up resistor.

MASTER (Input)
  Tied to ground to indicate the master device in a multiple device system. Must be left open circuit in a single device system. Internal pull up.

OEN (Input)
  Output enable signal. Active low.

CS3:0 (Outputs)
  Four address bits from a master specifying one of sixteen devices in a multiple device system. Must be externally decoded to provide chip enables for the additional devices.

F1:0 (Outputs)
  These bits indicate the field selection given by the auto select logic. The same coding as that used for control register bits C5:4 is used.

VCC / GND (Supply)
  Four power and ground pairs. All must be connected.
BASIC OPERATION
The PDSP16488A convolver performs a weighted sum of all the pixels within an N x N two dimensional window. Each pixel value is multiplied by a signed coefficient, or weight, and the products are summed together. In practice positive weights would be used to produce averaging effects, with various distribution laws, and negative weights would be used for edge enhancement.

The window is moved continuously over the video frame, and for real time operation a new result must be obtained for every pixel clock. In most applications odd sized windows will be used, resulting in a centre pixel whose value is modified by the surrounding pixels.

OUTPUT ACCURACY
With 8 bit pixels and an 8 x 8 window, it is possible for the accumulated sum to grow to 22 bits within a single device. With 16 bit pixels and an 8 x 4 window (the maximum possible), the sum can grow to 29 bits. The PDSP16488A actually allows for word growth up to 32 bits, and thus allows several devices to be cascaded without any danger of overflow. Since coefficients can be negative, the final result is a 32 bit signed two's complement number.

In a particular application the desired output will lie somewhere within these 32 bits, the actual position being dependent on the coefficient values used. This causes problems in physically choosing which output pins to connect to the rest of the system. To overcome this problem the PDSP16488A contains an output multiplier, or gain control, which allows the final result to be aligned to the most significant end of the 32 bit internal result. The provision of a multiplier, rather than a simple shifter, allows the gain to be defined more accurately. The sixteen most significant bits of the adjusted result are available on output pins, and contain a sign bit.

OUTPUT SATURATION
If the output from the convolver is driving a display, negative pixels will give erroneous results.
An option is thus provided which forces all negative results to zero, which are then interpreted as black by the display. At the same time positive results which overflow the gain control are forced to saturate at the most positive number, ie peak white. In this mode the output sign bit is always zero, and should not be connected to a D/A converter.

A separate option forces both negative and positive overflows to saturate at their respective maximum values, but in scale negative results remain valid. A gain control overflow warning flag is also available, which can be used in a host CPU supported system to change the gain parameters if overflows are not acceptable.

BINARY OUTPUT
The PDSP16488A contains a 16 bit arithmetic comparator which allows the output from the gain control to be compared with a previously programmed value. An output flag allows the user to determine if the result was above or below a value contained within an internal register.

MULTIPLIER ARRAY
The PDSP16488A contains sixteen 8 x 8 multipliers, each producing a 16 bit result. Internally the pixel clock supplied by the user can be multiplied by two or four, which together with the proprietary architecture allows each multiplier to be used several times within a pixel clock period. This increases the effective number of multipliers available to the user from 16 to 32 or 64 respectively. This architecture produces a very efficient utilization of chip area, and allows the line delays to be accommodated on the same device.

The sixteen multipliers are arranged in a 4 deep by 4 wide array, resulting in effective arrays of 4 by 8 or 8 by 8 with the multi-cycling options. The multiplier array can also be configured to handle 16 bit signed pixels; the effective number of available multipliers is then halved.

LINE DELAY OPERATION
Internal RAM is arranged in two separate groups, and can be configured to provide line delays to match the chosen size of the convolver.
When a four deep arrangement is used, with 8 bit pixels, four line delays are available, and each can be programmed to contain up to 1024 pixels. In an eight deep array, or if 16 bit pixels are needed, each line can contain up to 512 pixels. Figure 4 illustrates the options available.

The first line delay in one of the groups can optionally be switched in or out under the control of an input pin. It is used to delay the pixel input when data is obtained from another convolver in a multiple device system, or it is used to support interlaced video.

Signals L7:0 may be used as pixel inputs or outputs. They are configured as inputs at power-on to avoid possible bus conflicts, but can become outputs when a mode control bit is set. They can then be used to drive another device when multiple PDSP16488As are required.

INTERLACED VIDEO
When using real time interlaced video, a picture or frame is composed from two fields, with odd lines in one field and even lines in the other. An external field delay is thus required to gather information from adjacent lines, and the convolver needs two input busses.

The bus providing the delayed pixels has an extra internal line delay. This is only used in the field containing the upper line in any pair of lines, and must be bypassed in the other field. It ensures that data from the previous field always corresponds to the line above the present active line, and avoids the need to change the position of the coefficients from one field to the next.

Figure 3 shows the translation from physical to internal line positions for single device interlaced systems. Line N is the line presently being convolved, which is either one or two lines previous to the line presently being produced. When windows requiring four or more lines are to be implemented, the first line delay in the group supplied from the L7:0 pins must always be by-passed. This by-pass option is controlled by register B, bit 7, and is not affected by the BYPASS input pin.
The coefficients must be loaded into the locations shown, which match the translated line positions, with unused coefficients, shown shaded, loaded with zeros.
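The word growth figures quoted under Output Accuracy (22 bits and 29 bits) can be checked with a little arithmetic. The sketch below is only an illustration: it assumes the worst case product width is the sum of the pixel and coefficient widths, and that each doubling of the number of products adds one bit. The function name `accumulator_bits` is hypothetical and is not part of any device documentation.

```python
def accumulator_bits(pixel_bits: int, coeff_bits: int, taps: int) -> int:
    """Upper bound on the bits needed to sum `taps` products without overflow.

    Assumption: each product needs pixel_bits + coeff_bits bits, and summing
    `taps` of them adds ceil(log2(taps)) bits of word growth.
    """
    if taps < 1:
        raise ValueError("taps must be at least 1")
    product_bits = pixel_bits + coeff_bits
    growth_bits = (taps - 1).bit_length()  # integer ceil(log2(taps))
    return product_bits + growth_bits


if __name__ == "__main__":
    # 8 bit pixels, 8 bit coefficients, 8 x 8 window
    print(accumulator_bits(8, 8, 8 * 8))
    # 16 bit pixels, 8 bit coefficients, 8 x 4 window
    print(accumulator_bits(16, 8, 8 * 4))
```

Both cases stay below the 32 bit internal word, which is the headroom that lets several devices be cascaded.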
[Figure 3. Line delay allocations in single device interlaced systems. Panels: 4 x 4 or 8 x 4 array with 1024 pixel line delays (output is shifted by 1 line in every field); 8 x 8 array with 512 pixel line delays and the first L7:0 delay by-passed, Reg B bit 7 set (output is shifted by 1 or 2 lines in every field); coefficient position maps for 3 x 3, 5 x 5 and 8 x 8 windows.]
DEFINING THE LENGTH OF THE LINE DELAY
Figure 4 defines the maximum line lengths available in each of the window size options. The actual line lengths can be defined in one of three ways, to support both real time applications, taking pixels directly from a camera, and also use in systems supported by a frame store. In the former case the line delays must be referenced to video synchronization pulses. In the latter case the line lengths are well defined, and the horizontal flyback 'dead times' will have been removed.

To support real time applications an option is provided in which the length of the line delay is defined by the number of clocks obtained whilst an input pin (HRES) is inactive. HRES would normally be composite sync when the convolver is directly attached to an NTSC or PAL video camera. Conceptually, the line delay is achieved by reading the previous contents of a RAM based line store, and then writing new information to the same address.

When HRES is active, write operations are inhibited and the address counter is reset. During an active line the counter is incremented by the pixel clock. If the maximum count is reached before the end of a line, then write operations are terminated and wrap-around effects are avoided. The active going edge of HRES, marking the end of a line, is normally asynchronous to the pixel clock, and it is possible for an additional pixel to be stored on some lines. This has no effect on the convolver operation, and will not cause a cumulative shift in the pixel position from line to line.

[Fig. 4. Line delay configurations]

An alternative means of defining the line length is, however, provided when an exact number of pixels is needed. HRES going inactive then starts the delay operation for every line, but it ceases when the 10 bit value contained in two registers is reached. This method can avoid the need to store blank pixels at the end of a line before sync goes active.
With this method the line must contain an even number of pixels, but the value loaded into the control registers defining the line length must be one less than the even number needed.

In an image processing system, the pixel clock is often re-synchronized, or even inhibited, during blanking or sync. The next line is then started with a precise time interval from the end of sync to the first pixel clock edge. This avoids any visible pixel jitter at the beginning of the line, which would otherwise be present since the pixel clock is asynchronous with respect to video sync pulses.

When using the PDSP16488A the pixel clock should not be inhibited, or re-synchronized, until the delayed version of the HRES input goes active. This is present on the DELOP output pin. This will ensure that no pixels on the right hand edge are lost due to the internal pipeline delay. If the pixel clock is a continuous signal, the user must ensure that the HRES inactive transition meets the timing requirements defined in Figure 10. The active going edge at the end of a line need not be synchronized.

When pixels are read from or written to a frame store, an alternative line delay configuration is needed. Within the frame store, lines would be stored in contiguous locations, with no gaps caused by the flyback period between the lines. This makes the HRES defined line delay operation difficult to use, and an alternative mode of operation is provided. The HRES input is then driven by a system provided signal which defines a complete frame store update period. It is not a line defining signal.

The high to low transition of this signal will initiate the line store update sequence and allow the internal address pointers to increment. These pointers will be synchronously reset at the end of a line, when they reach the pre-programmed value. They will then immediately start a new operation using address zero.
The actual line delay must be pre-loaded into two control registers as described previously.

Write operations back to the frame store must allow for the total pipeline delay. This can be achieved by inhibiting write operations until the delayed version of HRES goes low at the DELOP output pin. Write operations then continue until it goes back high.

The PDSP16488A assumes that data is valid when a clock signal is applied, and that it also meets the set up and hold requirements given in Figure 10. If data is not valid, due for example to a frame store DRAM refresh cycle, then the user must externally inhibit the clock. The clock supplied to the convolver will in this mode be a signal which defines a frame store cycle time.

The use of the convolver in a line scan system is similar to its use with a frame store. These systems have no flyback period, and the address counter must be synchronously reset at the end of the line and then allowed to continue.

GAIN CONTROL
The gain control is provided as an aid to locating the bits of interest in the 32 bit internal result. The magnitude of the largest convolved output will depend on the size of the window and the coefficient values used. The function of the gain control is then to produce an output which is accurate to 16 bits, and which is aligned to the most significant end of this 32 bit word. The sixteen most significant bits of the word are available on output pins, and the largest number need only have one sign bit if the gain control is correctly adjusted.

Figure 5 indicates the mechanism employed, with the required function implemented in two steps. Two mode control bits allow one of four 20 bit fields to be selected from the final 32 bit value. These four fields are positioned with the first at the most significant end, and then at four bit displacements down to the least significant end.

By setting an enabling bit, the field selection can optionally be done automatically. This feature should only be used in the real time operating mode, when HRES defines video lines. Internal logic examines the most significant 13, 9, or 5 bits from the 32 bit result, and makes a field selection dependent on which group does not contain identical sign bits. If fewer than five sign bits are obtained, the logic will select the field containing the most significant 20 bits.

The automatic selection is particularly useful when a fixed scene is being processed. The selection is reset when any internal register is updated (ie PROG has been active) and is then held inactive for ten further occurrences of the HRES input. This allows the internal multiplier/accumulator array to be completely flushed before a field selection is made. As convolver outputs of greater magnitude are produced, the field selection logic will respond by selecting a more significant field. The most significant field found necessary remains selected until PROG again goes active.

Even if the automatic field selection is not enabled, two outputs, F1:0, will still indicate which field would have been selected. These are coded in the same way as register C, bits 5:4.
Having chosen a field, either manually or automatically, it is then multiplied by a 4 bit unsigned integer. This is contained within a user programmed register, and the multiplication will produce a 24 bit result. The middle 16 bits of this result contain the required output bits. The gain control multiplier can overflow into the unused most significant four bits if the parameters are chosen wrongly. This condition is indicated by an overflow flag.

By setting appropriate mode control bits, further manipulation of the gain control output is possible. One option allows all negative outputs to be forced to zero, and at the same time positive gain control overflows will saturate at the maximum positive number. A different option will saturate positive and negative overflows at their respective maximum values, but otherwise leaves them unchanged. Occasional overflows can be tolerated in some systems, and this option prevents any gross errors.

EXPANSION
Multiple devices can be connected in cascade in order to fabricate window sizes larger than those provided by a single device. This requires an additional adder in each device which is fed from expansion data inputs. This adder is not used by a single device or the first device in a cascaded system, and can be disabled by a mode control bit.

The first device in the cascaded system must be designated as a master device by tying an input pin low. Its expansion input bus is then used as the source of data for the coefficient and control registers in all devices in the system.

In order to reduce the pin count required for 32 bit busses, both expansion in and data out are time multiplexed with the phases of the pixel clock. When the clock is high the least significant half will be valid, and when the clock is low the most significant half will be valid. In practice this multiplexing is only possible with pixel clocks up to 20 MHz.
Above these frequencies the multiplexing must be inhibited by setting a mode control bit (register A, bit 7). The intermediate data accuracy will then be reduced, since only the lower 16 bits of the internal 32 bit intermediate sum are available on the output pins. In such systems the coefficients must be scaled down in order to keep the intermediate and final results down to 16 bits. The final device should not use the gain control, and instead should simply output the non-multiplexed 16 bit result. The overflow flag and pixel saturation options will not be available.

PIXEL INPUT AND OUTPUT DELAYS
In a real time system, when line delays are referenced to video sync pulses present on the HRES input, the first pixel from the last line delay does not appear on the L7:0 pins until the fifth active pixel clock edge after HRES has gone low. This is illustrated in Figure 7. In a vertically expanded system, this output provides the input to the first line delays in the vertically displaced devices. The internal logic is thus designed to always expect this five clock delay.

Compensation must thus be applied to the devices which are directly connected to the video source, such that the first pixel is not valid until the fifth clock edge. For this reason the PDSP16488A contains an optional four clock pipeline delay on each of the pixel data inputs. When the delay is used, the first pixel in a video line must be available on the input pins after the first pixel clock edge. This would be so if the device were connected to an A/D converter, since that would introduce a one pixel pipeline delay. If the system introduces any further external pipeline delays, then the internal delay should be bypassed, and the user should ensure that the first pixel is valid after the fifth clock edge.

The use of this four clock delay is controlled by bit 3 in control register B.
This delay is in addition to the delays which are provided to support expansion in both the X and Y directions, which are controlled by register D, bits 3:2. Both delays are in fact simply added together in the device, but are provided for conceptually different reasons.

[Fig. 5. Gain control operation: 32 bit result from the expansion adder, mux selecting one of four 20 bit fields, multiplication by the 4 bit gain register giving a 24 bit product, saturate logic, 16 bit output on D15:0.]
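The two-step gain control mechanism (field selection, then multiplication by a 4 bit gain) can be modelled in a few lines. The following is a behavioural sketch only, not a description of the actual silicon. The function name `gain_control` and the numbering of fields (0 for the most significant 20 bits, 3 for the least significant) are assumptions made for illustration; the text does not give the coding of register C bits 5:4, and the saturation options are not modelled.

```python
def gain_control(result32: int, field: int, gain: int):
    """Behavioural model of field select followed by the 4 bit gain multiply.

    result32: signed 32 bit internal result.
    field:    hypothetical index, 0 = bits 31:12 ... 3 = bits 19:0.
    gain:     unsigned 4 bit multiplier (0..15).
    Returns (output16, overflow).
    """
    if not -(1 << 31) <= result32 < (1 << 31):
        raise ValueError("result32 must fit in 32 signed bits")
    if field not in (0, 1, 2, 3) or not 0 <= gain <= 15:
        raise ValueError("field must be 0..3 and gain 0..15")

    # Step 1: pick a 20 bit field at a 4 bit displacement.
    shift = 12 - 4 * field
    selected = result32 >> shift                       # arithmetic shift
    selected = ((selected + (1 << 19)) % (1 << 20)) - (1 << 19)  # wrap to 20 bits

    # Step 2: 20 bit x 4 bit multiply gives a 24 bit product.
    product = selected * gain

    # The middle 16 bits (bits 19:4) are the output; the top four bits
    # (23:20) are unused and only differ from the sign on overflow.
    scaled = product >> 4
    overflow = not -(1 << 15) <= scaled < (1 << 15)
    output16 = ((scaled + (1 << 15)) % (1 << 16)) - (1 << 15)
    return output16, overflow


if __name__ == "__main__":
    print(gain_control(0x00012340, field=3, gain=1))
    print(gain_control(0x7FFFF000, field=0, gain=2))
```

In this model a result that only occupies the low bits is picked up with the least significant field, while applying too much gain to a full scale value sets the overflow flag.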
DELAY COMPENSATION FOR LARGE WINDOWS
A large window is composed of several partial windows, each of which is implemented in an individual device. If necessary the partial window must be padded with zero coefficients to become one of the standard sizes. When constructing a large window it is necessary to delay the expansion data inputs in order to compensate for growth in the horizontal direction. Delays in the partial sums are also necessary to compensate for the total pipeline delay needed to produce the previous complete horizontal stripe.

Within each device in a horizontal stripe, apart from the first, the expansion input must be delayed by the width of the partial window before it is added to the internal sum. Since partial windows can only be 4 or 8 pixels wide, a delay of 4 or 8 pixel clocks is needed. There is, however, an in-built delay of 4 pixels in the inter device connection, and the PDSP16488A thus only needs an option to delay the expansion input by an additional four pixels.

[Fig. 6. Multi-device delay paths. Annotations include: line delays and 4 clock delay blocks per device; B3 = 1, D3:2 = 00, D0 = 0; B3 = 0, D0 = 0 or 1; D = 4 + S(N-1) defined by D3:2; expansion delay 0 if S = 4, 4 if S = 8; width = S; Nth device in the row.]

The data from the last device in a horizontal row of convolvers feeds the expansion input of the first device in the next row.
This is shown in Figure 6. With this arrangement, the position of the partial window as illustrated is the inverse of its vertical position on a normal TV screen. Thus the top, left hand, device corresponds to the bottom, left hand, portion of the complete window. The output from the last device in the row is delayed with respect to the original data input by an amount given by the formula:

  Delay = 4 + (N - 1) x S

where N is the number of devices in a row and S is the partial window width, ie 4 or 8.
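As a quick check of the formula above, the row output delay can be computed for a few configurations. This is a minimal sketch; the function name `row_output_delay` is hypothetical.

```python
def row_output_delay(n_devices: int, partial_width: int) -> int:
    """Delay, in pixel clocks, of the last device in a row: 4 + (N - 1) x S."""
    if n_devices < 1:
        raise ValueError("a row needs at least one device")
    if partial_width not in (4, 8):
        raise ValueError("partial windows can only be 4 or 8 pixels wide")
    return 4 + (n_devices - 1) * partial_width


if __name__ == "__main__":
    for n, s in [(1, 4), (2, 8), (3, 4)]:
        print(f"N={n}, S={s}: delay = {row_output_delay(n, s)} pixel clocks")
```

For example, two devices with 8 pixel wide partial windows give a delay of 12 pixel clocks.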
The internal convolver sums, in each of the devices in the next row, must be delayed by this amount before they are added to results from the previous row. This is more conveniently achieved by delaying data going into the line stores. The required cumulative delay with respect to the first horizontal stripe is then automatically obtained when more than two rows of devices are needed.

Two bits in control register D are used to define one of four delay options. These delays have been selected to support systems needing from two to eight devices, and are described in the applications section.

[Fig. 7. Pixel input delays: HRES (sync) and clock waveforms, set up time, first pixel valid (B3 set), first pixel from line store valid, active line period, asynchronous back edge, line store writes inhibited, last 2 pixels internally stored.]

COEFFICIENTS
Sixty-four coefficients are stored internally and must be initially loaded from an external source. Table 3 gives the coefficient addresses within a device, with coefficient C0 specified by the least significant address and C63 by the most significant address. Table 5 shows the physical window position within the device which is allocated to each coefficient in the various modes of operation.

Horizontally the coefficient positions correspond to the convolution process as if it were conceptually observed on a viewing screen, ie the left hand pixel is multiplied with C0. In the vertical direction the lines of coefficients are inverted with respect to a visual screen, ie the line starting with C0 is actually at the bottom of the visualized window.

The coefficients may be provided from a host CPU using conventional addressing, a read/write line, data strobe, and a chip enable. Alternatively, in stand alone systems, an EPROM may be used. A single EPROM can support up to 16 devices with no additional hardware.

When windows are to be fabricated which are smaller than the maximum size that the device will provide in the required configuration, the areas which are not to be used must contain zero coefficients. The pipeline delay will then be that of a completely filled window.

Table 3. Internal register addressing

  Function             Hex addr
  Mode reg A           00
  Mode reg B           01
  Mode reg C           02
  Mode reg D           03
  Comparator LSB       04
  Comparator MSB       05
  Scale value          06
  Pixels / line LSB    07
  Pixels / line MSB    08
  Unused               09 - 3F
  C0 - C15             40 - 4F
  C16 - C31            50 - 5F
  C32 - C47            60 - 6F
  C48 - C63            70 - 7F

TOTAL PIPELINE DELAY
The total pipeline delay is dependent on the device configuration and the number of devices in the system. Table 4 gives the delays obtained with the various single device configurations when the gain control is used. These delays are the internal processing delays and do not include the delays needed to move a given size window completely into a field of interest. When multiple devices are needed, additional delays are produced which must be calculated for the particular application. These delays are discussed in the applications section.

The PDSP16488A contains facilities for outputting a delayed version of HRES to match any processing delay. Control register bits allow this delay to be selected from any value between 29 and 92 pixel clocks.

Table 4. Pipeline delays

  Data size   Window size   Pipeline delay
  8           4 x 4         34
  8           8 x 4         30
  8           8 x 8         26
  16          4 x 4         28
  16          8 x 4         26
table 5 physical coefficient position

note: two coefficients occurring in the same box have identical values.

the rows below are listed in coefficient order, starting with the row that contains c0. each diagram also marks the pixel inputs (ip7:0 and l7:0, with msb and lsb markings in the 16 bit diagrams) and the line store lengths (1024 or 512 pixels).

8x4, 8 bit data
c0   c1   c2   c3   c4   c5   c6   c7
c8   c9   c10  c11  c12  c13  c14  c15
c16  c17  c18  c19  c20  c21  c22  c23
c24  c25  c26  c27  c28  c29  c30  c31

4x4, 16 bit data
c0/c16   c1/c17   c2/c18   c3/c19
c4/c20   c5/c21   c6/c22   c7/c23
c8/c24   c9/c25   c10/c26  c11/c27
c12/c28  c13/c29  c14/c30  c15/c31

4x4, 8 bit data
c0   c1   c2   c3
c4   c5   c6   c7
c8   c9   c10  c11
c12  c13  c14  c15

8x4, 16 bit data
c0/c32   c1/c33   c2/c34   c3/c35   c4/c36   c5/c37   c6/c38   c7/c39
c8/c40   c9/c41   c10/c42  c11/c43  c12/c44  c13/c45  c14/c46  c15/c47
c16/c48  c17/c49  c18/c50  c19/c51  c20/c52  c21/c53  c22/c54  c23/c55
c24/c56  c25/c57  c26/c58  c27/c59  c28/c60  c29/c61  c30/c62  c31/c63

8x8, 8 bit data
c0   c1   c2   c3   c4   c5   c6   c7
c8   c9   c10  c11  c12  c13  c14  c15
c16  c17  c18  c19  c20  c21  c22  c23
c24  c25  c26  c27  c28  c29  c30  c31
c32  c33  c34  c35  c36  c37  c38  c39
c40  c41  c42  c43  c44  c45  c46  c47
c48  c49  c50  c51  c52  c53  c54  c55
c56  c57  c58  c59  c60  c61  c62  c63
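the coefficient ordering and the zero fill rule for small windows can be illustrated with a short sketch. it is not part of the device documentation: the function name pack_coefficients is hypothetical, the coefficients are assumed to be signed 8 bit values, and placing the small kernel against the c0 corner of the window is an assumption, since the text only requires unused positions to hold zero.

```python
def pack_coefficients(kernel, window_width, window_height):
    """Flatten a kernel, written as it appears on screen, into c0..cN order.

    kernel[0] is the top visual row. The text states that the line starting
    with c0 is at the bottom of the visualized window and that the left hand
    pixel is multiplied with c0, so rows are reversed and columns are not.
    Unused window positions are left at zero.
    """
    rows, cols = len(kernel), len(kernel[0])
    if rows > window_height or cols > window_width:
        raise ValueError("kernel is larger than the configured window")
    coeffs = [0] * (window_width * window_height)
    for visual_row, row in enumerate(kernel):
        line = rows - 1 - visual_row  # bottom visual row goes to the c0 line
        for col, value in enumerate(row):
            if not -128 <= value <= 127:  # assumption: signed 8 bit coefficients
                raise ValueError("coefficient does not fit in 8 bits")
            coeffs[line * window_width + col] = value
    return coeffs


# a 3x3 kernel placed in an 8x8 window: 64 values, 55 of them zero
example = pack_coefficients([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 8, 8)
```

the returned list index n corresponds to coefficient cn, which table 3 places at address 40 hex plus n.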
loading registers from a host cpu

the expansion data inputs [x14:0] on a single or master device are connected to the host bus to provide address and data for the internal registers. in a multiple device system the remaining devices receive addresses and data which have been passed through the expansion connection between earlier devices in the cascade chain. each device needs an individual chip enable plus a global data strobe, read/write line, and prog signal from the host.

registers are individually addressed and can be loaded in any sequence once the global prog signal has been produced by the host. the latter would normally be produced from an address decode encompassing all the necessary device addresses.

if a self timed system is to be implemented, a timing strobe must be passed down the expansion chain through the pc1 / pc0 connections. the pc1 output from the final device is used as a host reply signal, and indicates that the last device has received data after the propagation delay of previous devices. the timing strobe is produced in the master device from the host data strobe, and will appear on the pc1 output. this feature allows the user to cascade any number of devices without knowing the propagation delay through each device. the timing information for this mode of operation is given in figure 8.

the host can also read the data contained in the internal registers. the required device is selected using chip enable with the r/w line indicating a read operation. single device systems output the data read on x7:0, but in multiple device systems data is read from the d7:0 outputs on the final device in the chain. these must be connected back to the host data bus through three-state drivers. when earlier devices in the chain are addressed, the register contents are transferred through the expansion connections down to the final device.
in the self timed configuration the data will be valid when the reply goes active, as shown in figure 8. if the reply signal is not to be used, the pc0 / pc1 connections are not necessary, and the host data strobe for a write operation must be wide enough to allow for the worst case propagation delay through all the devices (tdel). if the data or address from the host does not meet the set up time given in figure 8, the width of the data strobe can simply be extended to compensate for the additional delay. when reading data the access time required is:

tacc + (n - 1).tdel

using the maximum times obtained from figure 8.

host control lines

x7:0   8 bit data bus. in a single device system this bus is bi-directional; in other configurations it is an input. only a single or master device is connected directly to the host. other devices receive data from the output of the previous device in the chain.

x14:8  7 bit address bus which is used to identify one of the 73 internal registers. connected in the same manner as x7:0.

x15    must be open circuit on the master device.

pc0    an input from the previous pc1 output in a multiple device chain. not needed on a single device or if the self timed feature is not used.

pc1    reply to the host from a single device or from the last device in a cascade chain. it indicates that the write strobe can be terminated. connected to the pc0 input of the next device at intermediate points in the chain if the self timed feature is used.

r/w    read/not write line from the host cpu which is connected to all devices in the system.

ce     an active low enable which is normally produced from a global address decode for the particular device. this must encompass all internal register addresses.

ds     an active low host data strobe which is connected to all devices in the system.

prog   an active low global signal, produced by the host, which is connected to all devices in the system. together with a unique chip enable for every device, it allows the internal registers to be updated or examined by the host. prog and ce should be tied together in a single device system.

loading registers from an eprom

in the eprom supported mode, one device has to assume the role of a host computer. if more than one device is present, this must be the first component in the chain, which must have its master pin tied low.

the master device contains internal address counters which allow the registers in up to 16 cascaded devices to be specified. it also generates the prog signal and a data strobe on the pins which were previously inputs. these outputs must be connected to the other devices in the system, which still use them as inputs. the r/w input should be tied low on all devices.

the width of the data strobe is determined by the feedback connection from the pc1 output on the last device to the pc0 input on the master. the pc0 / pc1 connections must be made between devices in a multiple device system; in a single device system the connection is made internally.

the available eprom access time is determined by an internal oscillator and does not require the pixel clock to be present during the programming sequence. any pixel clock re-synchronization in a real time system will thus not affect the coefficient load operation. the relevant eprom timing information is shown in figure 9.

the load procedure will commence after reset has gone from active to inactive, and will be indicated by the prog output going active. the data from 73 eprom locations will be loaded into the internal registers using addresses corresponding to those in table 3. within a particular page of 128 eprom locations, the first nine locations supply control register information, and the top 64 supply coefficients. the middle 55 locations are not used.
if the window size is 8 x 4, the top 32 locations will also contain redundant data, and if the size is 4 x 4 the top 48 will be redundant.
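the page layout just described can be sketched as a small image builder. this is only an illustration under stated assumptions: the names build_eprom_page and build_eprom_image are hypothetical, unused locations are filled with zero (the text does not specify a fill value), and the nine control bytes are taken in table 3 address order 00 to 08.

```python
PAGE_SIZE = 128    # one page of eprom locations per device
COEFF_BASE = 0x40  # address of c0 in table 3


def build_eprom_page(control, coefficients):
    """Build one 128 byte page: control registers at 00-08, coefficients at 40-7f."""
    if len(control) != 9:
        raise ValueError("expected nine control register bytes")
    if len(coefficients) > 64:
        raise ValueError("at most 64 coefficients fit in a page")
    page = bytearray(PAGE_SIZE)        # locations 09-3f stay unused (zero here)
    page[0:9] = bytes(control)
    for n, value in enumerate(coefficients):
        page[COEFF_BASE + n] = value & 0xFF  # store the low 8 bits of each value
    return bytes(page)


def build_eprom_image(pages):
    """Concatenate contiguous pages, first device first (assumed ordering)."""
    if not 1 <= len(pages) <= 16:
        raise ValueError("between 1 and 16 pages are supported")
    return b"".join(pages)


page0 = build_eprom_page(list(range(9)), [-1, 2, 3])
image = build_eprom_image([page0, page0])
```

in a 4 x 4 configuration only the first 16 coefficient locations carry useful data, so the list passed in can simply be shorter than 64.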
in a multiple device system the load sequence will be repeated for every device, and four additional address bits will be generated on the cs3:0 pins. these address bits provide the eprom with a page address, with one page allocated to each device in the system. within each page only 73 locations provide data for a convolver; the remainder are redundant, as in the single device system.

the cs3:0 outputs must also be decoded in order to provide individual chip enables for each device. these can readily be derived by using an as138 ttl decoder. bits in an internal control register determine the number of times that the sequence is repeated.

if changes to the convolver operation are to be made after power-on, activating the ce input on the master or single device will instigate the load procedure. additional eprom address bits supplied from the system will allow different filter coefficients to be used.

eprom control lines

x7:0   8 bit data from the eprom to the master or single device. otherwise data is received from the previous device in the chain.

x14:8  lower 7 address bits to the eprom from a master or single device. otherwise an input from the data outputs of the previous device.

x15    tied to ground on a master device to indicate the eprom mode.

r/w    tied low on all devices.

ds     an output from a master or single device which provides a data strobe for the other devices.

cs3:0  four additional address bits for the eprom which are provided by the master device. they allow 16 additional devices to be used and must be externally decoded to provide chip enables.

pc0    an input on the master device which is driven from the pc1 output of the last device in the chain. used internally to terminate the write strobe. connected to previous pc1 outputs at intermediate points in the chain. not needed for a single device.

pc1    an output connected to the pc0 input of the next device in the chain. the last device feeds back to the master. not needed for a single device.

ce     an enable which is produced by decoding cs3:0 from the master. it is not needed for a master or single device, which will always use the bottom block of addresses with internally generated write strobes. it can however be used on these devices to initiate a new load procedure after the initial power on sequence.

prog   an active low signal produced by an eprom supported master or single device. an input to all other devices. it indicates that a register load sequence is occurring, either after power on, or as the result of ce as explained above. it remains active until register 73 in the final device has been loaded. four bits in a control register define the number of cascaded devices.

system configuration

the device is configured using a combination of the state of the single and master pins, and the contents of the four mode control registers. in a master or single device the state of the x15 pin is used to define whether the system is eprom or host supported.

mode control registers

register a bit allocation

bits 3:0  these bits are 'don't care' when using a host computer, but to a master device in an eprom supported system they define the number of interconnected chips. the eprom must contain contiguous 128 byte blocks for each of the devices in the system, and a 4 bit counter in the master device will sequence through up to 16 block reads. an internal comparator in the master causes the loading of the internal registers to cease when the value in the counter equals that contained in these bits. the bits are redundant in a single device, which only uses one 128 byte block.

bits 6:4  these bits define one of the five basic configurations. the line delays will automatically be configured to match the chosen window size and pixel accuracy. the maximum clock rate that is available to the user reflects the internal multiplication factor.
bit    code    function
3:0    xxxx    number of extra devices from 1-15
6:4    000     8 bit, 8x8 window, 10mhz max, 8x512 line delays
6:4    001     16 bit, 8x4 window, 10mhz max, 4x512 line delays
6:4    010     16 bit, 4x4 window, 20mhz max, 4x512 line delays
6:4    011     8 bit, 8x4 window, 20mhz max, 4x1024 line delays
6:4    101     8 bit, 4x4 window, 40mhz max, 4x1024 line delays
7      0       multiplexed exp. data
7      1       non-mux. exp. data
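as an illustration of the register a bit allocation, the following sketch packs the three fields into one byte. the function name encode_register_a and its parameter names are hypothetical; the codes are copied from the table, and the value written to bits 3:0 is passed through unchanged without interpreting it.

```python
# configuration codes for bits 6:4, keyed by (pixel bits, window size)
CONFIG_CODES = {
    (8, "8x8"): 0b000,
    (16, "8x4"): 0b001,
    (16, "4x4"): 0b010,
    (8, "8x4"): 0b011,
    (8, "4x4"): 0b101,
}


def encode_register_a(pixel_bits, window, chain_field=0, non_mux_expansion=False):
    """Return the register a byte for the chosen configuration.

    chain_field is the raw value for bits 3:0 (only meaningful to an eprom
    supported master); non_mux_expansion sets bit 7.
    """
    if (pixel_bits, window) not in CONFIG_CODES:
        raise ValueError("not one of the five basic configurations")
    if not 0 <= chain_field <= 15:
        raise ValueError("bits 3:0 hold a 4 bit value")
    code = CONFIG_CODES[(pixel_bits, window)]
    return (int(non_mux_expansion) << 7) | (code << 4) | chain_field


reg_a = encode_register_a(8, "4x4", non_mux_expansion=True)
```

the example selects the 8 bit 4x4 configuration with non-multiplexed expansion data, which is the combination the text associates with pixel clocks above 20mhz.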
bit 7     this bit must be set if the pixel clock is greater than 20mhz. it disables the output and input time multiplexing, and instead outputs the least significant half of the 32 bit intermediate sum for the complete clock cycle. when the gain control is used, the output multiplexing will automatically be disabled.

register b bit allocation

bit 0     this bit defines the input for the second group of line delays. it must be set in the 16 bit pixel modes, and is set by power on reset.

bits 2:1  these bits control the mode of operation of the line stores. in real time systems pixels can be stored either until hres [sync] goes active, or until a pre-determined count is reached. in the frame store mode line store operations are continuous, with a pre-determined line length.

bit 3     when this bit is set four pipeline delays are added to the pixel inputs to compensate for the internal/external delays between line stores. the extra delay is only necessary when a device is supplied with system video in which the first pixel in a line is valid in the period following the first active clock edge. see fig 7. the delay is not necessary if the device is fed from the output of another convolver. when set this bit will add four additional delays to those defined by register d, bits 3:2.

bit 4     when this bit is set the expansion adder will not be used. it is automatically set in a master or single device.

bit 7     this bit controls the bypass option on the first line delay on the l7:0 inputs. it is only effective when an 8 bit pixel mode is selected which also needs more than four line delays. when l7:0 are used as outputs it should always be reset. in the 16 bit modes the bypass function is only controlled by the bypass pin, and the bit is redundant.

bit    code    function
0      0       second line delay group fed from the first group
0      1       second line delay group fed from l7:0 which become inputs
2:1    00      store pixels to end of line
2:1    01      store pixels till count is reached
2:1    10      frame store operation
2:1    11      not used
3      0       no delays on pixel inputs
3      1       4 delays on both pixel inputs
4      0       use expansion adder
4      1       expansion adder disabled
6:5    -       not used
7      0       use first delay in second group
7      1       bypass first delay in second group

register c bit allocation

bit 0     if this bit is set, the 20 bit field selected from the 32 bit result is defined automatically by internal logic.

bits 3:1  these bits are used in conjunction with register d, bits 7:5, to define the pixel delay from the hres input to the delop pin. they are used to match the appropriate processing delay in a particular system. the minimum delay is 29 pixel clocks.

bits 5:4  these bits define which of the four 20 bit fields out of the 32 bit final result is selected as the input to the gain control. they are redundant when the gain control is not used, or if register c, bit 0, is set.

bits 7:6  these bits define the use of the gain control as given in the table. intermediate devices in a multiple device system must by-pass the gain control, otherwise the additional pipeline delays will affect the result. disabling the scaler will reduce the device pipeline by 13 pclk cycles from the delays shown in table 4.

bit    code    function
0      0       field selection defined by c5:4
0      1       automatic field selection
3:1    000     delop = 29 + 0 clks
3:1    001     delop = 29 + 8 clks
3:1    010     delop = 29 + 16 clks
3:1    011     delop = 29 + 24 clks
3:1    100     delop = 29 + 32 clks
3:1    101     delop = 29 + 40 clks
3:1    110     delop = 29 + 48 clks
3:1    111     delop = 29 + 56 clks
5:4    00      select upper 20 bits
5:4    01      select next 20 bits
5:4    10      select next 20 bits
5:4    11      select bottom 20 bits
7:6    00      by-pass the gain control
7:6    01      normal gain control o/p
7:6    10      saturate at max +ve and -ve values
7:6    11      force -ve to zero, saturate +ve values
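the delop delay is set by two fields: register c, bits 3:1, add multiples of 8 clocks to the minimum of 29, and register d, bits 7:5, add 0 to 7 further clocks. a minimal sketch of that split follows; the function names are hypothetical, and the helper that merges the fields into existing register bytes assumes the bit positions given in the register tables.

```python
DELOP_MIN = 29  # minimum hres to delop delay in pixel clocks


def encode_delop_delay(total_clocks):
    """Split a delop delay into (register c bits 3:1, register d bits 7:5)."""
    extra = total_clocks - DELOP_MIN
    if not 0 <= extra <= 63:
        raise ValueError("delop delay must lie between 29 and 92 clocks")
    coarse, fine = divmod(extra, 8)  # coarse steps of 8 clocks, fine steps of 1
    return coarse, fine


def apply_delop_delay(reg_c, reg_d, total_clocks):
    """Return updated register c and d bytes with the delay fields replaced."""
    coarse, fine = encode_delop_delay(total_clocks)
    reg_c = (reg_c & ~0x0E & 0xFF) | (coarse << 1)
    reg_d = (reg_d & ~0xE0 & 0xFF) | (fine << 5)
    return reg_c, reg_d


new_c, new_d = apply_delop_delay(0x00, 0x00, 33)  # 29 + 4 clocks
```

the range check reproduces the limits quoted earlier: 29 + 56 + 7 gives the maximum of 92 pixel clocks.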
register d bit allocation

bit    code    function
0      0       x15:0 not delayed
0      1       x15:0 delayed
1      0       internal sum not shifted
1      1       internal sum multiplied by 256
3:2    00      i/p to line stores not delayed
3:2    01      i/p to line stores delayed by 4
3:2    10      i/p to line stores delayed by 8
3:2    11      i/p to line stores delayed by 12
4      0       un-signed pixel data input
4      1       2's complement pixel data input
7:5    xxx     add 0 to 7 clock delays to delop output

bit 0     if this bit is set the expansion data input is delayed by four pixel clocks before it is added to the present convolver output. it is used in multiple device systems when the partial window width is 8 pixels.

bit 1     when this bit is set the internal sum is shifted to the left by 8 places before being added to the expansion input. it is used when two devices are used, each in an 8 bit pixel mode, to fabricate a 16 bit pixel mode.

bits 3:2  these bits define the delays on both sets of pixel inputs before entering the line stores. the delays are always identical on both sets.

bit 4     when this bit is set the convolver interprets 8 or 16 bit pixels as 2's complement signed numbers.

bits 7:5  these bits add 0 to 7 additional clock delays to those selected by register c, bits 3:1.

absolute maximum ratings [see notes]

supply voltage vcc                             -0.5v to 7.0v
input voltage vin                              -0.5v to vcc + 0.5v
output voltage vout                            -0.5v to vcc + 0.5v
clamp diode current per pin ik (see note 2)    18ma
static discharge voltage (hbm)                 500v
storage temperature ts                         -65°c to 150°c
max. junction temperature, military            150°c
package power dissipation                      3000mw
thermal resistance, junction to case θjc       5°c/w

notes on maximum ratings
1. exceeding these ratings may cause permanent damage. functional operation under these conditions is not implied.
2. maximum dissipation for 1 second should not be exceeded; only one output to be tested at any one time.
3. exposure to absolute maximum ratings for extended periods may affect device reliability.
4. current is defined as negative into the device.

static electrical characteristics

operating conditions (unless otherwise stated): tamb = -55°c to +125°c, vcc = 5.0v ±10%

characteristic            symbol  min.  typ.  max.  units  subgroup  conditions
output high voltage       voh     2.4   -     -     v      1,2,3 *   ioh = 4ma
output low voltage        vol     -     -     0.4   v      1,2,3 *   iol = -4ma
input high voltage        vih     2.0   -     -     v      1,2,3 *   except clk, res = 4v
input low voltage         vil     -     -     0.8   v      1,2,3 *
input leakage current     iin     -10   -     +10   µa     1,2,3 *   gnd < vin < vcc, no internal pull up
input capacitance         cin     -     10    -     pf     †
output leakage current    ioz     -50   -     +50   µa     1,2,3 *   gnd < vout < vcc, no internal pull up
output s/c current        isc     10    -     300   ma     †         vcc = max

note: signal pins pc0, x15, master, single, bypass and 0v have pull-up resistors in the range 15kΩ to 200kΩ. signal pins prog and ds require external pull-up resistors in eprom mode.

[test waveform, measurement levels: three-state delays are defined for output high to output high impedance, output low to output high impedance, output high impedance to output low, and output high impedance to output high, with measurement levels of 0.5v and 1.5v marked relative to vh and vl. vh = voltage reached when output driven high; vl = voltage reached when output driven low.]
pdsp16488a ma 15 fig. 8. host timing pch t t exp t dsh pc1 from master or single device pc1 from last device (reply) hsu t valid address/data from the host host data o/p from first device t del valid hh t host data strobe chip enable datum t prog t psu t ph t wait t wait t pch > csu t ch t acc t rsu t ra r/w from the host coefficient output t rsu pc1 in-active delay after ds in-active symbol ra characteristic value units notes min. max. ds hold time after reply active host address/data set up time ns ns t t dsh hsu 20 0 host signal hold time expansion in to data out in prog m ode delay from strobe to pc1 ns chip enable set up time ns prog set up time ns ns ns t t t 50 ns t t t hh del exp csu psu pch 5 30 [equivalent to pc0 to pc1 delay ] 50 greater than tdel under all conditions 0 prog hold time chip enable hold time t ph ch t 0 0 0 coefficient read time coefficients valid time before reply acc t 50 5 ns ns only applicable if reply is used. otherwise time is referenced to risng edge of strobe when set up must be n xtdel, for n devices read set uptime to prevent write t 5ns only applicable for read ops & if reply is used. must always be guaranteed. no clocks are needed in prog mode defines data strobe in-active time from master or single device ? ? ? ? ? ? ? ? ? ? ? ? ? all parameters marked * are tested during production. parameters marked ? are guaranteed by design and characterisation
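the read access time quoted in the host loading section, tacc + (n - 1).tdel, can be evaluated with a one line helper. this is a sketch only: the function name is hypothetical and the numbers in the usage line are placeholders, so the maximum tacc and tdel should be taken from the figure 8 table for a real design.

```python
def read_access_time_ns(n_devices, t_acc_ns, t_del_ns):
    """Access time needed when reading a register through a chain of n devices."""
    if n_devices < 1:
        raise ValueError("at least one device is required")
    # one device access time plus one expansion delay per following device
    return t_acc_ns + (n_devices - 1) * t_del_ns


# placeholder timings, not datasheet values
four_device_read = read_access_time_ns(4, t_acc_ns=50, t_del_ns=30)
```

with a single device the expression reduces to tacc, since no expansion delay is involved.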
pdsp16488a ma 16 fig. 9. eprom timing t t t rw wh ww ds from master pc1 from master t pcd t valid t ch t t da t pch ad pc1 from last device [ pc0 to master ] eprom address eprom data ce valid valid valid data o/p from first device valid t del t del ds data o/p from second device t exp pc1 from next device t csu characteristic symbol value units notes min. max. delay from data strobe to master p c1 ns delay from pc0 input to write in-active ns pc1 in-active delay ns write from master in-active ns write in-active to new address ns eprom data set up time ns data strobe from master ns single device chip enable set up time availible eprom access time 200 ns 50 5 250 20 10 expansion in to data out pc0 to pc1 delay t t t t t t t t pcd wh pch ww ad ds rw t ch t da t del exp 0 50 30 30 50 ns ns ns greater than t at all t emps del chip enable hold time t csu 0 ns ? ? ? ? ? ? ? ? ? ? ? ? at all temps
characteristic symbol value units notes min. max. pixel clock low time (a) 32 bit muxed output (b) 16 bit output (a) 32 bit muxed output (b) 16 bit output 25 (a) 10 (b) 25 (a) 10 (b) ns ns t cl pixel clock high time t t t t t t t data in set up time data in hold time clk rising to output delay line store output delay hres in-active set up time output enable time output disable time 21 20 15 15 10 0 10 ns ns ns ns ns ns ns dsu dh rd ld rsu dlz dhz t ch sub group 9,10,11 9,10,11 9,10,11 9,10,11 9,10,11 9,10,11 9,10,11 9,10,11 9,10,11 increase to 25ns for delop output measured with a 15k ? series resistor and 30pf load capacitance * * * * * * ? ? ?
applications information

device requirements

the number of devices required to implement a given convolver window depends on the size of the window, the required pixel rate, and whether the pixel accuracy is to be 8 or 16 bits. in practice the pdsp16488a supports windows requiring one, two, four, six, or eight devices without additional logic. table 2 gives typical window sizes which may be obtained with the above number of devices. figures 11 through 18 show system interconnections for these arrangements. other configurations are possible but may need the support of additional pixel/line delays and/or expansion adders. although not necessarily shown, all configurations can be supported by either an eprom or a host computer. interlaced or non-interlaced video may also be used, unless explicitly stated otherwise in the text.

expansion with 8 bit pixels is a straightforward process and the number of devices needed is easily deduced from the window sizes available in a single device. at pixel rates above 20mhz it may not be practical to use more than four devices, since the full 32 bit intermediate precision is not available. the lack of expansion multiplexing reduces the intermediate precision to 16 bits. the partial sum outputs must thus not overflow these 16 bits; this will require the coefficients to be scaled down appropriately, with a resulting loss in accuracy.

expansion with 16 bit pixels can be achieved in several ways. the simplest way is to use two devices, each working with 8 bit pixels. one device handles the least significant part of the data, and its output feeds the expansion input of a second device. this performs the most significant half of the calculation. the least significant half is then added to the most significant sum, after the latter has been multiplied by 256, ie shifted by eight places. this shift is done internally and controlled by register d, bit 1.
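the two device arithmetic described above can be checked numerically. the sketch below models only the sums, not the hardware timing; the function names are hypothetical and the pixels are assumed to be unsigned 16 bit values, since signed pixels would need the upper byte treated as a signed quantity.

```python
def convolve_window(pixels, coeffs):
    """Sum of products over one window position."""
    return sum(p * c for p, c in zip(pixels, coeffs))


def split_convolve(pixels16, coeffs):
    """Model two 8 bit devices sharing the same coefficients."""
    low_bytes = [p & 0xFF for p in pixels16]   # least significant device
    high_bytes = [p >> 8 for p in pixels16]    # most significant device
    ls_sum = convolve_window(low_bytes, coeffs)
    ms_sum = convolve_window(high_bytes, coeffs)
    # the most significant sum is multiplied by 256 before the add
    return ls_sum + (ms_sum << 8)


pixels = [0x1234, 0xFFFF, 0x0001, 0x8000]
coeffs = [1, -2, 3, -4]
direct = convolve_window(pixels, coeffs)
split = split_convolve(pixels, coeffs)
```

both results agree because each pixel equals 256 times its upper byte plus its lower byte, and the sum of products is linear in the pixel values.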
the internal 32 bit accuracy prevents any loss in precision due to the shift and add operation. the window size with this arrangement is restricted to that available in a single device, at the required pixel rate but with 8 bit pixels. thus two devices can be used, for example, to provide an 8 x 8 window with 16 bit pixels and 10 mhz rates.

if a larger extended precision window is needed, it is possible to use four devices. each device is then programmed to be in a 16 bit data mode, but should be restricted to rates below 20 mhz if the 32 bit intermediate precision is to be maintained. in the 16 bit modes, however, the output from the last line delay is not available due to pin limitations. this is not a problem in a four device interlaced system, since half of the devices will be fed from an external field delay. in non interlaced systems additional external line delays would be needed.

an alternative approach would be to configure all the devices in the appropriate 8 bit mode, do separate least significant and most significant calculations, and then combine the results in an external adder after a wired in shift.

single device systems

figure 11 illustrates both eprom and host supported single device systems, with or without interlaced video. in both cases the single and x15 pins must be tied low, and the pc0, pc1, and ds pins are redundant. the prog pin becomes an output and indicates that a register load sequence is occurring. the first line delay must always be bypassed in a non interlaced system; however, since an internal pull up is not provided, the bypass pin should be tied to vcc for correct operation. with interlaced video the bypass input is used to distinguish between the odd and even fields.

the ce input may be left open circuit if coefficients are to be simply loaded after a power on reset signal, the latter being applied to the res input.
alternatively the ce input may be used to change the coefficients at any time after power on reset; the eprom would then need additional address bits for the extra sets of coefficients that are to be stored.

in an interlaced system the pixels from the previous field must use the ip7:0 inputs, and the live pixels must use the l7:0 inputs. interlaced systems requiring extended precision pixels are not supported with a single device, since the l7:0 inputs are then used for the least significant 8 bits, and the ip7:0 inputs for any more significant bits.

if the x15 pin is left open circuit, an internal pull up will configure the device in the host supported mode. the host must then supply a data strobe and a r/w control line. the x7:0 pins must be connected to the host data bus, and are used to both load and read back register values. the prog and ce pins may be connected together, and then driven by a host address decode. the output on pc1, which provides a reply to the host, need not be used if the width of the data strobe is greater than the maximum texp value given in figure 8.

the configuration bits 6:4 in register a define the window size, maximum pixel rate, and pixel resolution. window sizes smaller than the maximum in any configuration are implemented by filling in the window with 'zero' coefficients. bits 3:0 are irrelevant in the single mode, as is bit 7 if the gain control is used.

the result would be expected to lie in either the bottom 20 bits of the 32 bit result, or possibly in the next 20 bit field displaced by four bits. register c, bits 5:4, must thus select one of these fields for subsequent use by the gain control. the gain is then adjusted such that the 16 outputs available on pins are in fact the 16 most significant bits of the result. the gain needed is application specific, but if too much gain is used the ov pin will indicate an overflow.
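choosing the field for register c, bits 5:4, amounts to finding the lowest 20 bit field that still contains the largest possible result. the sketch below is an illustration under explicit assumptions: the four fields are taken to start at bits 0, 4, 8 and 12 of the 32 bit result (inferred from the phrase 'displaced by four bits'), one extra bit is reserved for the sign, and the function names are hypothetical.

```python
# register c bits 5:4 code for each assumed field offset (lsb position)
FIELD_CODES = {12: 0b00, 8: 0b01, 4: 0b10, 0: 0b11}


def worst_case_sum(coefficients, max_pixel):
    """Largest magnitude the convolver sum can reach for these coefficients."""
    return sum(abs(c) for c in coefficients) * max_pixel


def select_field(max_magnitude):
    """Return (field offset, bits 5:4 code) for the lowest field that fits."""
    bits_needed = max_magnitude.bit_length() + 1  # plus one sign bit (assumed)
    for offset in (0, 4, 8, 12):
        if bits_needed <= offset + 20:
            return offset, FIELD_CODES[offset]
    raise ValueError("result does not fit in 32 bits")


small = select_field(worst_case_sum([1] * 9, 255))    # 3x3 box filter, 8 bit pixels
large = select_field(worst_case_sum([127] * 64, 255)) # full 8x8 window, large coefficients
```

a small kernel stays in the bottom field, while a full window of large coefficients moves the result up by one field, which matches the two cases the text expects for a single device.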
register b, bits 2:1, must be set to select the required method of defining the length of the line delays, and the use of bit 3 is dependent on any external pixel delays before the convolver input. no additional delays are needed on the pixel inputs in a single device system, and register d, bits 3:2, should be reset. the pipeline delay in the delop output path should match one of those in table 4, and is window size dependent.

dual device configurations

two devices, each configured with 8 bit pixels and 8w x 4d windows, can be used to provide an 8 x 8 window at up to 20 mhz pixel rates. figure 12 shows both the non interlaced and interlaced arrangements. video lines containing up to 1024 pixels are possible in both configurations, since each device only needs four line delays.

one device is configured as the master by grounding the master pin; the other then receives control signals in the normal way and has its master and single pins left open circuit. the internal convolver sum, in the device producing the final result, must be delayed by 4 pixels to match the inherent delay in the expansion output from the other device. this is actually achieved by delaying the pixel inputs to the line stores [register d bits 3:2 = 01]. no additional delay in the expansion input is needed, but the pipeline delay used to produce delop must be four clocks greater than that given in table 4 for a single device. the delop output is redundant in one of the two devices.

two devices can also be used to support systems requiring 16 bit pixels. with this approach the 16 x 8 multiplication is mechanized as two 8 x 8 operations, with the results added together after the most significant half has been shifted by 8 places to the most significant end. this shift operation is controlled by register d, bit 1. both convolvers are programmed to contain the same coefficients. the convolved output can theoretically grow to 30 bits, and the appropriate field must be selected before using the gain control.

examples of this operating mode are shown in figure 13. each device must be configured in the same 8 bit pixel operating mode, but the device producing the final result must use the 8 place shift option on its internal sum. the least significant 8 bits of the pixel are connected to the master device and the most significant 8 bits are connected to the device producing the final result. the internal sum in this device must be delayed by four pixels to match the delay in the expansion output from the first device. this is actually achieved by delaying the pixel inputs to the line stores [register d bits 3:2 = 01]. the expansion input needs no additional delay [register d bits 1:0 = 10]. the actual pixel precision can be any number of bits between 8 and 16, and may be a signed or unsigned number.
any unused, more significant bits must respectively be either sign extended or tied low. delop must have four additional pipeline delays in order to match the total processing delay. this output can be obtained from either device.

four device systems

four devices, each in the 8x8 mode, can be used to provide a 16 x 16 window, with 8 bit pixel resolution and 10 mhz clock rates. the partial sum from the first device in each row must be delayed by eight pixel clocks before it is added to the result from the next device. this provides the eight pixel displacement to match the width of the window. the delay is actually provided by four additional delays in the expansion input to the next device, plus the inherent four clock delays in outputting results from the first device. register d, bit 0, controls the additional delay.

the internal convolver sums, in the two devices in the second row, must be delayed by 12 clocks before they are added to the result from the first row. this twelve clock delay is necessary because of the combination of the eight pixel horizontal displacement delay and the four clock delay in outputting the result from the last device in the top row. it is actually achieved by delaying the pixel inputs to the line stores [register d bits 3:2 = 11].

the delop output must have 20 delays additional to those in a single device. this compensates for the twelve delays added to the convolver sums in the second row, plus an additional eight delays to compensate for the partial width of the first device in the second row.

four devices can also be used to give an 8x8 window, but with a 30 mhz pixel clock. each device is configured to provide a 4x4 partial window, but the maximum pixel rate is reduced from 40 to 30 mhz because of the response of the line delay expansion circuitry. intermediate precision is restricted to 16 bits, since time multiplexed data outputs cannot be used above 20 mhz.
this configuration requires no additional delay in the expansion inputs, and the inputs to the line stores in both devices in the second row must be delayed by 8 clock cycles [register d bits 3:2 = 10]. the delop output needs twelve additional clock delays to match the processing delay.

figures 14 and 15 show non-interlaced and interlaced versions of the above 8 x 8 and 4 x 4 arrangements.

figure 16 shows how four devices can also be used to provide an 8x8 window, with 16 bit pixels and 20mhz clock rates. the expansion data from a previous device needs no additional delay, since the partial window size in each device is only 4x4. the internal convolver sums from each device in the second row must be delayed by 8 clks and the delop output must have 12 additional delays. if this arrangement is to be used in a non-interlaced application, the field store must be replaced by four line delays.

six device systems

as shown in figure 17, six devices, each in an 8w x 4d mode using 8 bit pixels, can provide a 16w x 12d window at 20mhz clock rates. expansion inputs from previous devices in a row [but not the first device in each row] need an extra 4 clks of delay, since the partial window is eight pixels wide. internal convolver sums need a differential delay of 12 clk cycles from row to row [register d bits 3:2 = 11]. the delop output must have 32 additional delays to match the total processing delay.

eight device systems

two additional chips will extend the above six device configuration to a 16 x 16 window. internal convolver sums must have differential delays of 12 clock cycles between rows, as in the six device system. the delop output needs 44 additional clock delays.

nine device systems

nine devices, each in the 8 x 8 mode, will provide a 24 x 24 window with 8 bit data and 10 mhz pixel clocks. this is shown in figure 18. expansion data inputs from previous devices in a row [but not the first device in each row] need an extra 4 clks of delay.
the internal convolver sums need differential delays of 20 clk cycles between rows. sixteen of the latter delays can be provided internally by setting register b, bit3, and also register d, bits 3:2. the four extra delays must be provided externally. the delop output needs 56 clock delays in addition to the 29 required for the 8 x 8 single device configuration.
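the delay figures quoted for the four, six, eight and nine device systems follow a regular pattern, and the short python sketch below checks that pattern against the quoted numbers. it is an illustration only: the class name, field names and the two "predicted" rules are assumptions inferred from the quoted values (the text spells the reasoning out only for the 16 x 16 four device case), and nothing here models the device registers themselves.

```python
from dataclasses import dataclass

# clocks taken to output a result from a device
# (the "inherent four clock delays" mentioned in the text)
OUTPUT_DELAY = 4


@dataclass(frozen=True)
class ArrayConfig:
    """One multi-device arrangement, described with numbers quoted in the text."""
    name: str
    cols: int         # devices per row
    rows: int         # rows of devices
    part_w: int       # partial window width handled by one device (pixels)
    part_d: int       # partial window depth handled by one device (lines)
    row_delay: int    # quoted differential delay between rows (clocks)
    delop_extra: int  # quoted additional delop delays (clocks)

    def window(self):
        """Overall window as (width, depth)."""
        return (self.cols * self.part_w, self.rows * self.part_d)

    def expansion_extra(self):
        """Extra expansion input delay: partial width minus the output delay."""
        return self.part_w - OUTPUT_DELAY

    def predicted_row_delay(self):
        """Assumed rule: horizontal displacement across the row plus output delay."""
        return (self.cols - 1) * self.part_w + OUTPUT_DELAY

    def predicted_delop_extra(self):
        """Assumed rule: every row delay plus the horizontal displacement."""
        return (self.rows - 1) * self.row_delay + (self.cols - 1) * self.part_w


# values taken from the four, six, eight and nine device descriptions
CONFIGS = [
    ArrayConfig("four devices, 8x8 mode", 2, 2, 8, 8, 12, 20),
    ArrayConfig("four devices, 4x4 partial windows", 2, 2, 4, 4, 8, 12),
    ArrayConfig("six devices, 8w x 4d mode", 2, 3, 8, 4, 12, 32),
    ArrayConfig("eight devices, 8w x 4d mode", 2, 4, 8, 4, 12, 44),
    ArrayConfig("nine devices, 8x8 mode", 3, 3, 8, 8, 20, 56),
]

for cfg in CONFIGS:
    consistent = (cfg.predicted_row_delay() == cfg.row_delay
                  and cfg.predicted_delop_extra() == cfg.delop_extra)
    print(cfg.name, cfg.window(), cfg.expansion_extra(),
          "consistent" if consistent else "mismatch")
```

for the nine device system the text gives the delop figure relative to the 29 delays of the single 8 x 8 device, so the total is 29 + 56 = 85 clocks. the partial window split assumed for the eight device entry (two columns by four rows of 8w x 4d devices) is taken from the statement that it extends the six device configuration.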
figure 11. single device systems

figure 12. 8 bit dual device systems
gc132 pin out table

gc   sig      gc   sig      gc   sig      gc    sig
1    n/c      34   n/c      67   n/c      100   n/c
2    d0       35   x2       68   ip1      101   vdd
3    oen      36   x3       69   gnd      102   f0
4    bin      37   x4       70   ip2      103   d15
5    pc1      38   n/c      71   n/c      104   n/c
6    vdd      39   x5       72   vdd      105   d14
7    gnd      40   gnd      73   ip3      106   d13
8    over     41   x6       74   vdd      107   gnd
9    n/c      42   x7       75   ip4      108   d12
10   hres     43   n/c      76   gnd      109   gnd
11   r/w      44   x8       77   ip5      110   vdd
12   ce       45   x9       78   gnd      111   vdd
13   n/c      46   vdd      79   ip6      112   d11
14   n/c      47   vdd      80   vdd      113   d10
15   gnd      48   vdd      81   ip7      114   d9
16   n/c      49   x10      82   vdd      115   gnd
17   ds       50   master   83   n/c      116   clk
18   gnd      51   n/c      84   l7       117   clk
19   vdd      52   x11      85   gnd      118   clk
20   prog     53   x12      86   l6       119   gnd
21   gnd      54   single   87   gnd      120   gnd
22   cs3      55   gnd      88   l5       121   d8
23   cs2      56   gnd      89   vdd      122   vdd
24   cs1      57   n/c      90   l4       123   d7
25   cs0      58   x13      91   vdd      124   d6
26   vdd      59   x14      92   l3       125   d5
27   res      60   n/c      93   vdd      126   d4
28   pc0      61   x15      94   l2       127   gnd
29   n/c      62   vdd      95   gnd      128   d3
30   delop    63   bypass   96   l1       129   n/c
31   x0       64   ip0      97   f1       130   d2
32   x1       65   vdd      98   l0       131   d1
33   n/c      66   n/c      99   n/c      132   n/c
figure 13. dual device 16 bit systems.

figure 14. four device non interlaced system.

figure 15. four device interlaced system.

figure 16. four device system with 16 bit pixels

figure 17. six device non interlaced system.

figure 18. nine device non interlaced system.
pin no.         pin no.         pin no.         pin no.
gc     volts    gc     volts    gc     volts    gc     volts
1      n/c      34     n/c      67     n/c      100    n/c
2      n/c      35     n/c      68     gnd      101    v1
3      v1       36     n/c      69     gnd      102    n/c
4      n/c      37     n/c      70     gnd      103    n/c
5      gnd      38     n/c      71     n/c      104    n/c
6      v1       39     n/c      72     v1       105    n/c
7      gnd      40     gnd      73     gnd      106    n/c
8      n/c      41     n/c      74     v1       107    gnd
9      n/c      42     n/c      75     v1       108    n/c
10     v1       43     n/c      76     gnd      109    gnd
11     gnd      44     n/c      77     v1       110    v1
12     v1       45     n/c      78     gnd      111    v1
13     n/c      46     v1       79     v1       112    n/c
14     n/c      47     v1       80     v1       113    n/c
15     v1       48     v1       81     v1       114    n/c
16     n/c      49     n/c      82     v1       115    gnd
17     v1       50     v1       83     n/c      116    n/c
18     gnd      51     n/c      84     gnd      117    n/c
19     v1       52     n/c      85     gnd      118    v1
20     gnd      53     n/c      86     gnd      119    gnd
21     gnd      54     v1       87     gnd      120    gnd
22     n/c      55     gnd      88     gnd      121    n/c
23     n/c      56     gnd      89     v1       122    v1
24     n/c      57     n/c      90     gnd      123    n/c
25     n/c      58     n/c      91     v1       124    n/c
26     v1       59     n/c      92     gnd      125    n/c
27     gnd      60     n/c      93     v1       126    n/c
28     n/c      61     n/c      94     gnd      127    gnd
29     n/c      62     v1       95     gnd      128    n/c
30     n/c      63     gnd      96     gnd      129    n/c
31     n/c      64     gnd      97     n/c      130    n/c
32     n/c      65     v1       98     gnd      131    n/c
33     n/c      66     n/c      99     n/c      132    n/c

part no: pdsp16488
single chip 2d convolver with integral line delays
package type: gc132
vdd max = +5.0v = v1
n/c = not connected

figure 19. life test/burn-in connections

note: pda is 5% and based on groups 1 and 7

ordering information
pdsp16488a ma gcpr (qfp package - compliant to mil-std-883)
pdsp16488a ma acbr (pga package - compliant to mil-std-883)
information relating to products and services furnished herein by zarlink semiconductor inc. or its subsidiaries (collectively "zarlink") is believed to be reliable. however, zarlink assumes no liability for errors that may appear in this publication, or for liability otherwise arising from the application or use of any such information, product or service or for any infringement of patents or other intellectual property rights owned by third parties which may result from such application or use. neither the supply of such information or purchase of product or service conveys any license, either express or implied, under patents or other intellectual property rights owned by zarlink or licensed from third parties by zarlink, whatsoever. purchasers of products are also hereby notified that the use of product in certain ways or in combination with zarlink, or non-zarlink furnished goods or services may infringe patents or other intellectual property rights owned by zarlink.

this publication is issued to provide information only and (unless agreed by zarlink in writing) may not be used, applied or reproduced for any purpose nor form part of any order or contract nor to be regarded as a representation relating to the products or services concerned. the products, their specifications, services and other information appearing in this publication are subject to change by zarlink without notice. no warranty or guarantee express or implied is made regarding the capability, performance or suitability of any product or service. information concerning possible methods of use is provided as a guide only and does not constitute any guarantee that such methods of use will be satisfactory in a specific piece of equipment. it is the user's responsibility to fully determine the performance and suitability of any equipment using such information and to ensure that any publication or data used is up to date and has not been superseded. manufacturing does not necessarily include testing of all functions or parameters. these products are not suitable for use in any medical products whose failure to perform may result in significant injury or death to the user. all products and materials are sold and services provided subject to zarlink's conditions of sale which are available on request.

purchase of zarlink's i2c components conveys a licence under the philips i2c patent rights to use these components in an i2c system, provided that the system conforms to the i2c standard specification as defined by philips.

zarlink, zl and the zarlink semiconductor logo are trademarks of zarlink semiconductor inc. copyright zarlink semiconductor inc. all rights reserved.

technical documentation - not for resale

for more information about all zarlink products visit our web site at www.zarlink.com
